library(tidyverse)         # for graphing and data cleaning
library(tidymodels)        # for modeling
library(naniar)            # for analyzing missing values
library(vip)               # for variable importance plots
theme_set(theme_minimal()) # Lisa's favorite theme
hotels <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-02-11/hotels.csv')

When you finish the assignment, remove the # from the options chunk at the top, so that messages and warnings aren’t printed. If you are getting errors in your code, add error = TRUE so that the file knits. I would recommend not removing the # until you are completely finished.

Put it on GitHub!

From now on, GitHub should be part of your routine when doing assignments. I recommend making it part of your process anytime you are working in R, but I’ll make you show it’s part of your process for assignments.

Task: When you are finished with the assignment, post a link below to the GitHub repo for the assignment.

Machine Learning review and intro to tidymodels

Read through and follow along with the Machine Learning review with an intro to the tidymodels package posted on the Course Materials page.

Tasks:

  1. Read about the hotel booking data, hotels, on the Tidy Tuesday page it came from. There is also a link to an article from the original authors. The outcome we will be predicting is called is_canceled.
  • Without doing any analysis, what are some variables you think might be predictive and why?

deposit_type, previous_cancellations, and stays_in_week_nights could be predictive. If the booking is nonrefundable, guests are less likely to cancel since they can’t get their money back. Also, a guest who has canceled many bookings before may be more likely to cancel this one. Moreover, people who stay on weeknights are more likely to be traveling for business and less likely to cancel because their schedules are more fixed.

  • What are some problems that might exist with the data? You might think about how it was collected and who did the collecting.

Since not all the data is collected from the bookings or change log database tables, mistakes can happen when other tables are merged with the existing ones. Moreover, we don’t know where the hotels are located; it is possible that all of them come from one specific region and thus are not representative.

  • If we construct a model, what type of conclusions will we be able to draw from it?

We could identify factors associated with a higher or lower probability that a hotel booking is canceled.

  2. Create some exploratory plots or table summaries of the variables in the dataset. Be sure to also examine missing values or other interesting values. You may want to adjust the fig.width and fig.height in the code chunk options.
hotels %>% 
  select(where(is.numeric)) %>% 
  pivot_longer(cols = everything(),
               names_to = "variable", 
               values_to = "value") %>% 
  ggplot(aes(x = value)) +
  geom_histogram(bins = 30) +
  facet_wrap(vars(variable), 
             scales = "free")

hotels 
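
For the missing-value part of this task, one possible starting point is sketched below. It runs on a tiny made-up tibble (`toy_hotels` and its values are hypothetical, not the real bookings) and assumes that `miss_var_summary()` from naniar is an acceptable way to count true `NA`s. The string `"NULL"` is counted separately, since it is not treated as missing by R.

```r
library(dplyr)
library(naniar)

# Hypothetical toy data standing in for `hotels`.
toy_hotels <- tibble::tibble(
  children = c(0, 1, NA, 0, 2),
  agent    = c("9", "NULL", "240", "NULL", "9"),
  company  = c("NULL", "NULL", "40", "NULL", "NULL")
)

# True missing values (NA) per variable.
na_summary <- miss_var_summary(toy_hotels)
na_summary

# "NULL" is an ordinary string, so count it explicitly
# in every character column.
null_counts <- toy_hotels %>%
  summarize(across(where(is.character), ~ sum(.x == "NULL")))
null_counts
```

On the real data, the same two calls could be applied to `hotels` directly.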
  3. First, we will do a couple of things to get the data ready.
  • I did the following for you: made outcome a factor (needs to be that way for logistic regression), made all character variables factors, removed the year variable and some reservation status variables, and removed cases with missing values (not NULLs but true missing values).

  • You need to split the data into a training and test set, stratifying on the outcome variable, is_canceled. Since we have a lot of data, split the data 50/50 between training and test. I have already set.seed() for you. Be sure to use hotels_mod in the splitting.

hotels_mod <- hotels %>% 
  mutate(is_canceled = as.factor(is_canceled)) %>% 
  mutate(across(where(is.character), as.factor)) %>% 
  select(-arrival_date_year,
         -reservation_status,
         -reservation_status_date) %>% 
  add_n_miss() %>% 
  filter(n_miss_all == 0) %>% 
  select(-n_miss_all)

set.seed(494)
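
A stratified 50/50 split might look like the sketch below. It uses a small made-up dataset (`toy_mod` is a hypothetical stand-in for `hotels_mod`), and the object names `toy_split`, `toy_training`, and `toy_testing` are assumptions chosen for illustration.

```r
library(tidymodels)

# Hypothetical stand-in for hotels_mod: 120 kept bookings, 80 cancellations.
set.seed(494)
toy_mod <- tibble(
  is_canceled = factor(rep(c(0, 1), times = c(120, 80))),
  lead_time   = rpois(200, lambda = 50)
)

# 50/50 split, stratified on the outcome so both halves keep
# roughly the same cancellation rate.
toy_split    <- initial_split(toy_mod, prop = 0.5, strata = is_canceled)
toy_training <- training(toy_split)
toy_testing  <- testing(toy_split)

# Compare the class proportions in the two halves.
toy_training %>% count(is_canceled) %>% mutate(prop = n / sum(n))
toy_testing  %>% count(is_canceled) %>% mutate(prop = n / sum(n))
```

Stratifying matters most when one class is rare; here it simply keeps the two halves comparable.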
  4. In this next step, we are going to do the pre-processing. Usually, I won’t tell you exactly what to do here, but for your first exercise, I’ll tell you the steps.
  • Set up the recipe with is_canceled as the outcome and all other variables as predictors (HINT: ~.).
  • Use a step_XXX() function or functions (I think there are other ways to do this, but I found step_mutate_at() easiest) to create some indicator variables for the following variables: children, babies, and previous_cancellations. So, the new variable should be a 1 if the original is more than 0 and 0 otherwise. Make sure you do this in a way that accounts for values that may be larger than any we see in the dataset.
  • For the agent and company variables, make new indicator variables that are 1 if they have a value of NULL and 0 otherwise. I also used step_mutate_at() for this, but there are other ways you could do it.
  • Use fct_lump_n() inside step_mutate() to lump together countries that aren’t in the top 5 most occurring.
  • If you used new names for some of the new variables you created, then remove any variables that are no longer needed.
  • Use step_normalize() to center and scale all the non-categorical predictor variables. (Do this BEFORE creating dummy variables. When I tried to do it after, I ran into an error - I’m still investigating why.)
  • Create dummy variables for all factors/categorical predictor variables (make sure you have -all_outcomes() in this part!!).
  • Use the prep() and juice() functions to apply the steps to the training data just to check that everything went as planned.
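
The pre-processing steps listed above can be sketched roughly as follows. This is one possible arrangement on made-up data, not the intended solution: `toy_training` and its values are hypothetical, the indicators overwrite the original columns rather than getting new names, and plain functions are passed to `fn`.

```r
library(tidymodels)

# Hypothetical stand-in for the training data.
set.seed(494)
n <- 300
toy_training <- tibble(
  is_canceled            = factor(rbinom(n, 1, 0.4)),
  lead_time              = rpois(n, 50),
  children               = rpois(n, 0.3),
  babies                 = rpois(n, 0.1),
  previous_cancellations = rpois(n, 0.2),
  agent   = factor(sample(c("NULL", "9", "240"), n, replace = TRUE)),
  company = factor(sample(c("NULL", "40", "223"), n, replace = TRUE)),
  country = factor(rep(c("PRT", "GBR", "FRA", "ESP", "DEU", "ITA", "USA", "BRA"),
                       times = c(90, 70, 50, 40, 25, 15, 7, 3)))
)

toy_recipe <- recipe(is_canceled ~ ., data = toy_training) %>%
  # 1 if the count is above 0; `> 0` also covers counts larger
  # than any seen in the training data.
  step_mutate_at(children, babies, previous_cancellations,
                 fn = function(x) as.numeric(x > 0)) %>%
  # 1 if the value is the string "NULL", 0 otherwise.
  step_mutate_at(agent, company,
                 fn = function(x) as.numeric(x == "NULL")) %>%
  # Keep the 5 most frequent countries, lump the rest into "Other".
  step_mutate(country = forcats::fct_lump_n(country, n = 5)) %>%
  # Center and scale numeric predictors (this includes the new indicators).
  step_normalize(all_predictors(), -all_nominal()) %>%
  # Dummy variables for the remaining factors, leaving the outcome alone.
  step_dummy(all_nominal(), -all_outcomes())

# Apply the steps to the training data to check the result.
toy_baked <- toy_recipe %>%
  prep(training = toy_training) %>%
  juice()

glimpse(toy_baked)
```

Because the indicators replace the original columns here, there is nothing left over to remove; with new variable names, a `step_rm()` would be needed as well.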
  5. In this step we will set up a LASSO model and workflow.
  • In general, why would we want to use LASSO instead of regular logistic regression? (HINT: think about what happens to the coefficients).
  • Define the model type, set the engine, set the penalty argument to tune() as a placeholder, and set the mode.
  • Create a workflow with the recipe and model.
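
As a rough sketch of the model and workflow setup, the block below defines a LASSO logistic regression and bundles it with a recipe. The data and the small recipe are hypothetical placeholders; `mixture = 1` is what makes the glmnet penalty a pure LASSO (L1) penalty, which can shrink some coefficients exactly to zero.

```r
library(tidymodels)

# Hypothetical toy data and recipe, only to have something to attach.
toy_training <- tibble(
  is_canceled = factor(c(0, 1, 0, 1, 1, 0)),
  lead_time   = c(10, 200, 35, 150, 90, 5),
  deposit     = factor(c("none", "nonrefund", "none", "nonrefund", "none", "none"))
)

toy_recipe <- recipe(is_canceled ~ ., data = toy_training) %>%
  step_normalize(all_predictors(), -all_nominal()) %>%
  step_dummy(all_nominal(), -all_outcomes())

# Model type, engine, penalty placeholder, and mode.
lasso_spec <- logistic_reg(mixture = 1, penalty = tune()) %>%
  set_engine("glmnet") %>%
  set_mode("classification")

# Workflow = recipe + model.
lasso_wf <- workflow() %>%
  add_recipe(toy_recipe) %>%
  add_model(lasso_spec)

lasso_wf
```

Nothing is fit at this point; `tune()` only marks `penalty` as a value to be chosen later.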
  6. In this step, we’ll tune the model and fit the model using the best tuning parameter to the entire training dataset.
  • Create a 5-fold cross-validation sample. We’ll use this later. I have set the seed for you.
  • Use the grid_regular() function to create a grid of 10 potential penalty parameters (we’re keeping this sort of small because the dataset is pretty large). Use that with the 5-fold cv data to tune the model.
  • Use the tune_grid() function to fit the models with different tuning parameters to the different cross-validation sets.
  • Use the collect_metrics() function to collect all the metrics from the previous step and create a plot with the accuracy on the y-axis and the penalty term on the x-axis. Put the x-axis on the log scale.
  • Use the select_best() function to find the best tuning parameter, fit the model using that tuning parameter to the entire training set (HINT: finalize_workflow() and fit()), and display the model results using pull_workflow_fit() and tidy(). Are there some variables with coefficients of 0?
set.seed(494) # for reproducibility
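
The tuning steps might be sketched as below, again on simulated data (`toy_training`, its predictors, and the simplified workflow are all hypothetical). The sketch uses `extract_fit_parsnip()`, which replaces `pull_workflow_fit()` in newer versions of workflows; with an older installation the latter may be what is available.

```r
library(tidymodels)

# Hypothetical data: two informative predictors and one pure-noise predictor.
set.seed(494)
n <- 400
toy_training <- tibble(
  lead_time = rnorm(n),
  adr       = rnorm(n),
  noise     = rnorm(n)
) %>%
  mutate(is_canceled = factor(rbinom(n, 1, plogis(1.5 * lead_time - adr))))

lasso_wf <- workflow() %>%
  add_recipe(recipe(is_canceled ~ ., data = toy_training) %>%
               step_normalize(all_predictors())) %>%
  add_model(logistic_reg(mixture = 1, penalty = tune()) %>%
              set_engine("glmnet") %>%
              set_mode("classification"))

set.seed(494) # for reproducibility
toy_cv <- vfold_cv(toy_training, v = 5)

# 10 candidate penalties, evenly spaced on the log10 scale.
penalty_grid <- grid_regular(penalty(), levels = 10)

# Fit every penalty on every fold.
lasso_tune <- tune_grid(lasso_wf, resamples = toy_cv, grid = penalty_grid)

# Accuracy against penalty, with the x-axis on the log scale.
lasso_tune %>%
  collect_metrics() %>%
  filter(.metric == "accuracy") %>%
  ggplot(aes(x = penalty, y = mean)) +
  geom_point() +
  geom_line() +
  scale_x_log10() +
  labs(x = "penalty", y = "accuracy")

# Pick the best penalty and refit on the whole training set.
best_penalty <- select_best(lasso_tune, metric = "accuracy")

final_fit <- lasso_wf %>%
  finalize_workflow(best_penalty) %>%
  fit(data = toy_training)

final_coefs <- final_fit %>%
  extract_fit_parsnip() %>%
  tidy()
final_coefs
```

Coefficients with an `estimate` of exactly 0 are the ones the penalty has dropped from the model.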
  7. Now that we have a model, let’s evaluate it a bit more. All we have looked at so far is the cross-validated accuracy from the previous step.
  • Create a variable importance graph. Which variables show up as the most important? Are you surprised?
  • Use the last_fit() function to fit the final model and then apply it to the testing data. Report the metrics from the testing data using the collect_metrics() function. How do they compare to the cross-validated metrics?
  • Use the collect_predictions() function to find the predicted probabilities and classes for the test data. Save this to a new dataset called preds. Then, use the conf_mat() function from yardstick (part of tidymodels) to create a confusion matrix showing the predicted classes vs. the true classes. Compute the true positive rate (sensitivity), true negative rate (specificity), and accuracy. See this Wikipedia reference if you (like me) tend to forget these definitions. Also keep in mind that a “positive” in this case is a cancellation (those are the 1’s).
  • Use the preds dataset you just created to create a density plot of the predicted probabilities of canceling (the variable is called .pred_1), filling by is_canceled. Use an alpha = .5 and color = NA in the geom_density(). Answer these questions:
  a. What would this graph look like for a model with an accuracy that was close to 1?
  b. Our predictions are classified as canceled if their predicted probability of canceling is greater than .5. If we wanted to have a high true positive rate, should we make the cutoff for predicted as canceled higher or lower than .5?
  c. What happens to the true negative rate if we try to get a higher true positive rate?
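
The test-set evaluation could be sketched as follows. Everything here is illustrative: the data are simulated, the penalty of 0.01 is an arbitrary assumed value rather than a tuned one, and the 0.3 cutoff at the end is only an example of how a different cutoff could be tried. Because the factor levels are `"0"` then `"1"`, `event_level = "second"` tells yardstick that a cancellation is the positive class.

```r
library(tidymodels)

# Hypothetical data standing in for hotels_mod.
set.seed(494)
n <- 600
toy_mod <- tibble(
  lead_time = rnorm(n),
  adr       = rnorm(n)
) %>%
  mutate(is_canceled = factor(rbinom(n, 1, plogis(1.5 * lead_time - adr))))

toy_split <- initial_split(toy_mod, prop = 0.5, strata = is_canceled)

# A finalized workflow with an assumed, fixed penalty.
final_wf <- workflow() %>%
  add_formula(is_canceled ~ .) %>%
  add_model(logistic_reg(mixture = 1, penalty = 0.01) %>%
              set_engine("glmnet"))

# Fit on the training half, evaluate on the testing half.
toy_last <- last_fit(final_wf, split = toy_split)
collect_metrics(toy_last)

preds <- collect_predictions(toy_last)

# Predicted vs. true classes.
conf_mat(preds, truth = is_canceled, estimate = .pred_class)

tpr <- yardstick::sensitivity(preds, truth = is_canceled,
                              estimate = .pred_class, event_level = "second")
tnr <- yardstick::specificity(preds, truth = is_canceled,
                              estimate = .pred_class, event_level = "second")
acc <- yardstick::accuracy(preds, truth = is_canceled, estimate = .pred_class)
bind_rows(tpr, tnr, acc)

# Density of the predicted probability of canceling, by true class.
ggplot(preds, aes(x = .pred_1, fill = is_canceled)) +
  geom_density(alpha = .5, color = NA)

# Reclassify with a different cutoff and recompute the rates.
preds_low <- preds %>%
  mutate(.pred_class_low = factor(if_else(.pred_1 > 0.3, "1", "0"),
                                  levels = c("0", "1")))
tpr_low <- yardstick::sensitivity(preds_low, truth = is_canceled,
                                  estimate = .pred_class_low, event_level = "second")
tnr_low <- yardstick::specificity(preds_low, truth = is_canceled,
                                  estimate = .pred_class_low, event_level = "second")
bind_rows(tpr_low, tnr_low)
```

Comparing the two pairs of rates side by side is one way to explore questions b and c.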
  8. Let’s say that this model is going to be applied to bookings 14 days in advance of their arrival at each hotel, and someone who works for the hotel will make a phone call to the person who made the booking. During this phone call, they will try to confirm that the person will be keeping their reservation or, if the person will be canceling, let them cancel now so there is still time to fill the room. How should the hotel go about deciding who to call? How could they measure whether it was worth the effort to do the calling? Can you think of another way they might use the model?

  9. How might you go about questioning and evaluating the model in terms of fairness? Are there any questions you would like to ask of the people who collected the data?

Bias and Fairness

Read Chapter 1: The Power Chapter of Data Feminism by Catherine D’Ignazio and Lauren Klein. Write a 4-6 sentence paragraph reflecting on this chapter. As you reflect, you might consider responding to these specific questions. We will also have a discussion about these questions in class on Thursday.

  • At the end of the “Matrix of Domination” section, they encourage us to “ask uncomfortable questions: who is doing the work of data science (and who is not)? Whose goals are prioritized in data science (and whose are not)? And who benefits from data science (and who is either overlooked or actively harmed)?” In general, how would you answer these questions? And why are they important?
  • Can you think of any examples of missing datasets, like those described in the “Data Science for Whom?” section? Or was there an example there that surprised you?
  • How did the examples in the “Data Science with Whose Interests and Goals?” section make you feel? What responsibility do companies have to prevent these things from occurring? Who is to blame?